ENHANCEMENT OF GENE EXPRESSION

This invention relates to a method and material for enhancing gene expression
in organisms, particularly in plants. One particular, but not exclusive, application of the
invention is the enhancement of carotenoid biosynthesis in plants such as tomato
(Lycopersicon spp.).

In order to increase production of a protein by an organism, it is known
practice to insert into the genome of the target organism one or more additional copies
of the protein-encoding gene by genetic transformation. Such copies would normally
be identical to a gene which is already present in the plant or, alternatively, they may be
identical copies of a foreign gene. In theory, multiple gene copies should, on
expression, cause the organism to produce the selected protein in greater than normal
amounts; this is referred to as "overexpression". Experiments have shown, however,
that low expression or no expression of certain genes can result when multiple copies
of the gene are present (Napoli et al 1990 and Dorlhac de Borne et al 1994). This
phenomenon is referred to as co-suppression. It most frequently occurs when
recombinant genes are introduced into a plant already containing a gene similar in
nucleotide sequence. It has also been observed in endogenous plant genes and
transposable elements. The effects of co-suppression are not always immediate and can
be influenced by developmental and environmental factors in the primary transformants
or in subsequent generations.

The general rule is to transform plants with a DNA sequence the codon usage
of which approximates to the codon frequency used by the plant. Experimental analysis
has shown that introducing a second copy of a gene identical in sequence to a gene
already in the plant genome can result (in some instances) in the expression of the
transgene, the endogenous gene or both genes being inactivated (co-suppression). The
mechanisms by which co-suppression occurs are unclear; however, there are
several theories incorporating both pre- and post-transcriptional blocks.

As a rule the nucleotide sequence of an inserted gene is "optimised" in two
respects. The codon usage of the inserted gene is modified to approximate to the
preferred codon usage of the species into which the gene is to be inserted. Inserted
genes may also be optimised in respect of the nucleotide usage with the aim of
approximating the purine to pyrimidine ratio to that commonly found in the target
species. When genes of bacterial origin are transferred to plants, for example, it is well
known that the nucleotide usage has to be altered to avoid highly adenylated regions,
common in bacterial genes, which may be misread by the eukaryotic expression
machinery as a polyadenylation signal specifying premature termination of the
transcript, resulting in truncation of the polypeptide. This is all common practice, and
it is entirely logical that an inserted sequence should mimic the codon and nucleotide
usage of the target organism for optimum expression.

An object of the present invention is to provide means by which co-suppression
may be obviated or mitigated.

According to the present invention there is provided a method of enhancing
expression of a selected protein by an organism having a gene which produces said
protein, comprising inserting into a genome of the said organism a DNA the nucleotide
sequence of which is such that the RNA produced on transcription is different from, but
the protein produced on translation is the same as, that expressed by the gene already
present in the genome.

The invention also provides a gene construct comprising in sequence a
promoter which is operable in a target organism, a coding region encoding a protein
and a termination signal, characterised in that the nucleotide sequence of the said
construct is such that the RNA produced on transcription is different from, but the
protein produced on translation is the same as, that expressed by the gene already
present in the genome.

The inserted sequence may have a constitutive promoter or a tissue or 
developmental preferential promoter. 

It is preferred that the promoter used in the inserted construct be different from
that used by the gene already present in the target genome. However, our evidence
suggests that it may be sufficient that the region between the transcription initiation
site and the translation initiation codon, sometimes referred to as the "5' intervening
region", be different. In other words, the co-suppression phenomenon is probably
associated with the transcription step of expression rather than the translation step: it
occurs at the DNA or RNA levels or both.
The invention further provides transgenic plants having enhanced ability to
express a selected gene and seed and propagating material derived from the said plant.

This invention is of general applicability to the expression of genes but will be
illustrated in one specific embodiment of our invention by a method of enhancing
expression of the phytoene synthase gene which is necessary for the biosynthesis of
carotenoids in plants, the said overexpression being achieved by the use of a modified
transgene having a different nucleotide sequence from the endogenous sequence.

Preferably said modified phytoene synthase gene has the sequence SEQ-ID-NO-1.

The invention also provides a modified chloroplast targeting sequence 
comprising nucleotides 1 to 417 of SEQ-ID-NO-1. 

In simple terms, our invention requires that protein expression be enhanced by
inserting a gene construct which is altered, with respect to the gene already present in
the genome, by maximising the dissimilarity of nucleotide sequence while maintaining
identity of the encoded protein. In other words, the concept is to express the same
protein from genes which have different nucleotide sequences within their coding
region and, preferably, the promoter region as well. It is desirable to approximate the
nucleotide usage (the purine to pyrimidine ratio) of the inserted gene to that of the
gene already present in the genome. We also believe it to be desirable to avoid the use
of codons in the inserted gene which are uncommon in the target organism and to
approximate the overall codon usage to the reported codon usage for the target
genome.

The degree to which a sequence may be modified depends on the frequency of
degenerate codons. In some instances a high proportion of changes may be made,
particularly to the third nucleotide of a triplet, resulting in a low DNA (and
consequently RNA) sequence homology between the inserted gene and the gene
already present, while in other cases, because of the presence of unique codons, the
number of changes which are available may be low. The number of changes which are
available can be determined readily by a study of the sequence of the gene which is
already present and of its degeneracy.
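The counting described above can be sketched in a few lines of Python. This is only an illustration, assuming the standard genetic code; the function name `synonym_counts` is hypothetical and is not part of the original work. For each codon it reports how many alternative codons encode the same amino acid, so that unique codons (methionine, tryptophan) show up as zero.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def synonym_counts(cds):
    """For each codon, count the alternative codons for the same amino acid."""
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    return [sum(1 for a in CODE.values() if a == CODE[codon]) - 1
            for codon in codons]


# First ten codons of the natural coding region (SEQ ID NO 2).
counts = synonym_counts("ATGTCTGTTGCCTTGTTATGGGTTGTTTCT")
changeable = sum(1 for n in counts if n > 0)
print(counts, changeable)
```

In this short stretch only the Met and Trp codons offer no synonymous alternative, which is the situation the text refers to as "unique codons".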
To obtain the gene for insertion in accordance with this invention it may be
necessary to synthesise it. The general parameters within which the nucleotide
sequence of the synthetic gene compared with the gene already present may be
selected are:

1. Minimise the nucleotide sequence similarity between the synthetic gene and the
gene already present in the plant genome;

2. Maintain the identity of the protein encoded by the coding region;

3. Maintain approximately the optimum codon usage indicated for the target
genome;

4. Maintain approximately the same ratio of purine to pyrimidine bases; and

5. Change the promoter or, at least, the 5'-intervening region.
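Parameters 1 to 3 above can be illustrated with a minimal Python sketch. It is not the procedure actually used to design the synthetic gene: the function `recode` and its optional `usage` argument (a codon-to-frequency mapping for the target organism) are hypothetical, no tomato codon frequencies are supplied here, and parameters 4 and 5 are not addressed. Each codon is replaced by the synonymous codon that differs from it at the most positions, with ties broken in favour of the higher usage value.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def split_codons(cds):
    cds = cds.upper()
    return [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]


def translate(cds):
    return "".join(CODE[c] for c in split_codons(cds))


def recode(cds, usage=None):
    """Swap every codon for its most dissimilar synonym (assumed heuristic)."""
    usage = usage or {}  # hypothetical codon -> relative frequency table
    out = []
    for codon in split_codons(cds):
        synonyms = [c for c, a in CODE.items()
                    if a == CODE[codon] and c != codon]
        if not synonyms:  # Met and Trp have a single codon
            out.append(codon)
            continue
        # Rank by number of mismatched bases, then usage, then name.
        best = max(synonyms,
                   key=lambda c: (sum(x != y for x, y in zip(c, codon)),
                                  usage.get(c, 0.0), c))
        out.append(best)
    return "".join(out)


native = "ATGTCTGTTGCCTTGTTATGGGTTGTTTCT"  # start of SEQ ID NO 2
synthetic = recode(native)
print(synthetic, translate(synthetic) == translate(native))
```

A real design would also weigh codon frequency more heavily than this tie-break does, since the text gives priority to avoiding uncommon codons.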

We have worked with the phytoene synthase gene of tomato. The DNA
sequence of the endogenous phytoene synthase gene is known (EMBL Accession Number
Y00521), and it was discovered that this sequence contained two sequencing errors toward
the 3' end. These errors were corrected in the following way: (1) the cytosine at
location 1365 was deleted and (2) a cytosine was inserted at 1421. The corrected phytoene
synthase sequence (Bartley et al 1992) is given herein as SEQ-ID-NO-2. Beginning with that
natural sequence we selected modifications according to the parameters quoted above
and synthesised the modified gene which we designated MTOM5 and which has the
sequence SEQ-ID-NO-1. Figure 1 herewith shows an alignment of the natural and
synthesised gene with retained nucleotides indicated by dots and alterations by dashes.
The modified gene MTOM5 has 63% homology at the DNA level and 100% at the protein
level, and the proportion of adenine plus thymine (the AT content) is 54% in the
modified gene compared with 58% for the natural sequence.

In the sequence listings provided herewith, SEQ ID NO 1 is the DNA sequence
of the synthetic (modified TOM5) gene referred to as MTOM5 in Figure 1, SEQ ID
NO 2 is the natural genomic phytoene synthase (Psy1) gene referred to as GTOM5 in
Figure 1, and SEQ ID NO 3 is the translation product of both GTOM5 and MTOM5.

In tomato (Lycopersicon esculentum), it has been shown that the carotenoid
lycopene is primarily responsible for the red colouration of developing fruit
(Bird et al 1991). The enzyme phytoene synthase, referred to herein
as Psy1, is an important catalyst in the production of phytoene, a precursor of
lycopene.

Psy1 catalyses the conversion of geranylgeranyl diphosphate to phytoene, the
first dedicated step in carotenoid biosynthesis.

The regulation and expression of the active Psy1 gene is necessary for the
production of lycopene and consequently the red colouration of fruit during ripening.
This can be illustrated by the yellow flesh phenotype of tomato fruits observed in a
naturally occurring mutant in which the Psy1 gene is inactive. In addition, transgenic
plants containing an antisense Psy1 transgene, which specifically down-regulates Psy1
expression, have also produced the yellow flesh phenotype of the ripe fruit.

When transgenic plants expressing another copy of the Psy1 gene (referred to
as TOM5) placed under the control of a constitutive promoter (being the Cauliflower
Mosaic Virus 35S promoter) were produced, approximately 30% of the primary
transformants produced mature yellow fruit indicative of the phenomenon of
co-suppression. Although some of the primary transformants produced an increased
carotenoid content, subsequent generations did not exhibit this phenotype, thus
providing evidence that co-suppression is not always immediate and can occur in
future generations.

The sequence of Psy1 is known and hence the amino acid sequence was
determined.

With reference to published GenBank genetic sequence data (Ken-nosuke Wada
et al 1992), a synthetic DNA was produced by altering the nucleotide sequence to one
which still had a reasonable frequency of codon use in tomato, and which retained the
amino acid sequence of Psy1. A simple swap between codons was used in cases where
there are only two codon options; in other cases the codons were changed
within the codon usage bias of tomato. Nucleotide sequence analysis indicated that the
synthetic DNA has a nucleotide similarity with Psy1 (TOM5, Bartley et al 1992) of
63% and amino acid sequence similarity of 100%.

The synthetic gene was then cloned into plant transformation vectors under the
control of the 35S promoter. These were then transferred into tomato plants by
Agrobacterium transformation, and both the endogenous and the synthetic gene appear
to express the protein. Analysis of the primary transformants shows no
evidence, such as the production of yellow fruit, of co-suppression between
the two genes.

The present invention will now be described by way of illustration in the
following examples.

EXAMPLE 1 

The coding region of the cDNA which encodes tomato phytoene synthase,
TOM5 (EMBL accession number Y00521), was modified since the original sequence
contained two errors towards the 3' end of the sequence. The sequence reported by
Bartley et al 1992 (J Biol Chem 267: 5036-5039) for TOM5 cDNA homologues
therefore differs from TOM5 (EMBL accession number Y00521). For the purpose of
the production of the synthetic gene the sequence used is a corrected version of the
TOM5 cDNA which is identical to Psy1 (Bartley et al 1992).

Design of the sequence

1. Potential restriction endonuclease cleavage sites were considered given the
constraints of the amino acid sequence. Useful sites around the predicted target
sequence cleavage site were introduced to aid subsequent manipulation of the
leader.

2. A simple swap between codons was used in cases where there are only two
codon options (e.g. lysine). In other cases codons were changed within the
codon usage bias of tomato as given by Ken-nosuke Wada et al (Codon usage
tabulated from the GenBank genetic sequence data, 1992, Nucleic Acids Research
20: S2111-2118). Priority was given to reducing homology and avoiding
uncommon codons rather than producing a representative spread of codon
usage.

3. A BamHI site was introduced at either end of the sequence to facilitate cloning
into the initial vector. At the 5' end four A residues were placed upstream of the
ATG according to the dicot start site consensus sequence (Cavener and Ray 1991,
Eukaryotic start and stop translation sites, NAR 19: 3185-3192).

4. The synthetic gene has been cloned into the vector pGEM4Z such that it can be
translated using SP6.

5. Restriction site, stemloop and codon usage analyses were performed, all results
being satisfactory.

6. The modified TOM5 sequence was termed CGS48 or MTOM5.
Sequence analysis

CGS48 AT content = 54%

TOM5 AT content = 58%

The nucleotide homology between TOM5 and CGS48 is 63%.
Amino acid sequence homology is 100%.
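Figures of this kind can be recomputed from the two coding sequences with a short Python sketch. The helper names `at_content` and `percent_identity` are illustrative assumptions, and the comparison is a simple ungapped, position-by-position one, which is appropriate here only because both sequences have the same length and reading frame. The example uses just the first ten codons of each sequence, so its values are not the whole-gene figures quoted above.

```python
def at_content(seq):
    """Percentage of A and T bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def percent_identity(a, b):
    """Ungapped identity between two equal-length sequences, in percent."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# First ten codons only; the full 1239-base sequences are in the listing.
tom5_start = "ATGTCTGTTGCCTTGTTATGGGTTGTTTCT"
cgs48_start = "ATGAGCGTGGCACTTCTTTGGGTGGTGAGC"
print(at_content(tom5_start), at_content(cgs48_start))
print(percent_identity(tom5_start, cgs48_start))
```

Running the same functions over the complete SEQ ID NO 1 and SEQ ID NO 2 strings would be the way to check the 54%, 58% and 63% values.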

In summary the sequence TOM5 (Acc. No. Y00521) was extracted from the
GenBank database and modified to incorporate the following corrections: deleted C at
1365, inserted C at 1421. CGS48 is based on the CDS of the modified Y00521 and the
original sequence, whilst retaining translation product homology and trying to maintain
optimal tomato codon usage.
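The two corrections can be expressed as simple string edits, sketched below in Python. Several points are assumptions rather than facts from the text: both coordinates are taken to be 1-based positions in the original database numbering, the inserted base is taken to occupy the stated position, and the helper names are hypothetical. The demonstration uses a short toy string, not the Y00521 entry itself.

```python
def delete_base(seq, pos, expected=None):
    """Remove the base at 1-based position pos, optionally checking it."""
    i = pos - 1
    if expected is not None and seq[i] != expected:
        raise ValueError(f"expected {expected} at {pos}, found {seq[i]}")
    return seq[:i] + seq[i + 1:]


def insert_base(seq, pos, base):
    """Insert base so that it occupies 1-based position pos in the result."""
    i = pos - 1
    return seq[:i] + base + seq[i:]


def correct(seq, delete_pos, insert_pos, base="C"):
    """Apply one deletion and one insertion given in original coordinates."""
    # Edit the higher coordinate first so the lower one is not shifted.
    if insert_pos > delete_pos:
        seq = insert_base(seq, insert_pos, base)
        return delete_base(seq, delete_pos, expected=base)
    seq = delete_base(seq, delete_pos, expected=base)
    return insert_base(seq, insert_pos, base)


# Toy example: drop the C at position 3, add a C at position 6.
print(correct("AACGGTT", delete_pos=3, insert_pos=6))
```

For the real entry the call would be `correct(y00521, 1365, 1421)`, where `y00521` is a hypothetical variable holding the database sequence.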

Assembly of CGS48

CGS48 was divided into three parts:
CGS48A: BamHI / KpnI
CGS48B: KpnI / SacI
CGS48C: SacI / BamHI
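The positions at which a sequence can be divided in this way may be located with a small search routine, sketched here in Python. The recognition sequences in `SITES` are the standard ones for these enzymes and are not quoted in the text above; the function name `find_sites` is hypothetical. Because each of these sites is palindromic, searching one strand is sufficient.

```python
# Standard recognition sequences (5' to 3'); all are palindromic.
SITES = {
    "BamHI": "GGATCC",
    "KpnI": "GGTACC",
    "SacI": "GAGCTC",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [1-based start positions]} for sites found in seq."""
    seq = seq.upper()
    hits = {}
    for name, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start + 1)
            start = seq.find(motif, start + 1)
        if positions:
            hits[name] = positions
    return hits


# Toy fragment carrying one KpnI and one SacI site.
print(find_sites("AAGGTACCTTGAGCTCAA"))
```

Applied to the full synthetic sequence, the same function would show whether the KpnI and SacI sites used for the three-part assembly are each unique.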

All three were designed to be cloned on EcoRI / HindIII fragments. The
sequences were divided into oligonucleotide fragments following computer analysis to
give unique complementarity in the overlapping regions used for the gene assembly.

The oligonucleotides were synthesised on an Applied Biosystems 380B DNA
synthesiser using standard cyanoethyl phosphoramidite chemistry. The oligonucleotides
were gel purified and assembled into full length fragments using our own procedures.

The assembled fragments were cloned into pUC18 via their EcoRI / HindIII
overhangs.

Clones were sequenced bi-directionally using "forward" and "reverse"
sequencing primers together with the appropriate "build" primers for the top and
bottom strands, using the dideoxy-mediated chain termination method for plasmid
DNA.

Inserts from correct CGS48A, B and C clones were isolated by digestion with
BamHI / KpnI, KpnI / SacI and SacI / BamHI respectively. The KpnI and SacI ends of the
BamHI / KpnI and SacI / BamHI fragments were phosphatased. All three fragments
were co-ligated into BamHI cut and phosphatased pGEM4Z. Clones with the correct
sized inserts oriented with the 5' end of the insert adjacent to the SmaI site were
identified by PCR amplification of isolated colonies and digestion of purified plasmid
DNA with a selection of restriction enzymes.

A CsCl purified plasmid DNA preparation was made from one of these clones.
This clone (CGS48) was sequenced bi-directionally using "forward" and "reverse"
sequencing primers together with the appropriate "build" primers for the top and
bottom strands, using the dideoxy-mediated chain termination method for plasmid
DNA.

EXAMPLE 2 

Construction of the MTOM5 vector with the CaMV 35S promoter

The fragment MTOM5 (CGS48) DNA described in EXAMPLE 1 was cloned
into the vector pJR1Ri (Figure 2) to give the clone pRD13 (Figure 3). The clone
CGS48 was digested with SmaI and XbaI and then cloned into pJR1Ri which had been cut
with SmaI and XbaI to produce the clone pRD13.

EXAMPLE 3 

Generation and analysis of plants transformed with the vector pRD13

The pRD13 vector was transferred to Agrobacterium tumefaciens LBA4404 (a
micro-organism widely available to plant biotechnologists) and used to transform
tomato plants. Transformation of tomato stem segments followed standard protocols
(e.g. Bird et al, Plant Molecular Biology 11, 651-662, 1988). Transformed plants were
identified by their ability to grow on media containing the antibiotic kanamycin.
Forty-nine individual plants were regenerated and grown to maturity. None of these plants
produced fruit which changed colour to yellow rather than red when ripening. The
presence of the pRD13 construct in all of the plants was confirmed by polymerase
chain reaction analysis. DNA blot analysis on all plants indicated that the insert copy
number was between one and seven. Northern blot analysis on fruit from one plant
indicated that the MTOM5 gene was expressed. Six transformed plants were selfed to
produce progeny. None of the progeny plants produced fruit which changed colour to
yellow rather than red during ripening.

The results are summarised in Table 1 below. The incidence of yellow, or
mixed yellow/red (for example, striped) fruits is indicative of suppression of phytoene
synthesis. Thus, with the normal GTOM5 construct, 28% of the transgenic plants
displayed the co-suppressed phenotype. All the plants carrying the modified MTOM5
construct of this invention had red fruit, demonstrating that no suppression of phytoene
synthesis had occurred in any of them.



TABLE 1

Construct                                       35S-GTOM5-nos   35S-MTOM5-nos

Total number of fruiting plants                       39              49

Number of plants producing yellow fruit                8               0

Number of plants producing mixed yellow and
red fruit or temporal changes                          3               0

Number of plants producing red fruit                  28              49

% plants showing co-suppression of Psy1               28%              0%
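
The percentages in the last row of Table 1 follow from the plant counts, as the short Python sketch below shows. It assumes, as the surrounding text indicates, that both yellow and mixed yellow/red fruit count as co-suppression; the function name is illustrative only.

```python
def cosuppression_percent(yellow, mixed, red):
    """Share of fruiting plants showing yellow or mixed fruit, in percent."""
    total = yellow + mixed + red
    return 100.0 * (yellow + mixed) / total


# Plant counts taken from Table 1.
gtom5 = cosuppression_percent(yellow=8, mixed=3, red=28)
mtom5 = cosuppression_percent(yellow=0, mixed=0, red=49)
print(round(gtom5), round(mtom5))
```

For the GTOM5 construct this is 11 affected plants out of 39, which rounds to the 28% stated in the table.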



FIGURE 1

Sequence Alignment of Modified TOM5
with the synthetic MTOM5

TOM5   ATG TCT GTT GCC TTG TTA TGG GTT GTT TCT    30
MTOM5  ATG AGC GTG GCA CTT CTT TGG GTG GTG AGC    30
        M   S   V   A   L   L   W   V   V   S

TOM5   CCT TGT GAC GTC TCA AAT GGG ACA AGT TTC    60
MTOM5  CCA TGC GAT GTG AGT AAC GGC ACT TCA TTT    60
        P   C   D   V   S   N   G   T   S   F

TOM5   ATG GAA TCA GTC CGG GAG GGA AAC CGT TTT    90
MTOM5  ATG GAG AGT GTG AGA GAA GGT AAT AGA TTC    90
        M   E   S   V   R   E   G   N   R   F

TOM5   TTT GAT TCA TCG AGG CAT AGG AAT TTG GTG   120
MTOM5  TTC GAC AGT TCT CGT CAC CGT AAC CTT GTT   120
        F   D   S   S   R   H   R   N   L   V

TOM5   TCC AAT GAG AGA ATC AAT AGA GGT GGT GGA   150
MTOM5  AGT AAC GAA CGT ATA AAC AGG GGA GGA GGT   150
        S   N   E   R   I   N   R   G   G   G

TOM5   AAG CAA ACT AAT AAT GGA CGG AAA TTT TCT   180
MTOM5  AAA CAG ACA AAC AAC GGT AGA AAG TTC TCA   180
        K   Q   T   N   N   G   R   K   F   S

TOM5   GTA CGG TCT GCT ATT TTG GCT ACT CCA TCT   210
MTOM5  GTT AGA TCA GCA ATC CTT GCA ACA CCT AGC   210
        V   R   S   A   I   L   A   T   P   S

TOM5   GGA GAA CGG ACG ATG ACA TCG GAA CAG ATG   240
MTOM5  GGT GAG AGA ACT ATG ACT AGC GAG CAA ATG   240
        G   E   R   T   M   T   S   E   Q   M

TOM5   GTC TAT GAT GTG GTT TTG AGG CAG GCA GCC   270
MTOM5  GTG TAC GAC GTC GTA CTT CGT CAA GCT GCA   270
        V   Y   D   V   V   L   R   Q   A   A

TOM5   TTG GTG AAG AGG CAA CTG AGA TCT ACC AAT   300
MTOM5  CTA GTT AAA CGT CAG TTA CGT AGT ACT AAC   300
        L   V   K   R   Q   L   R   S   T   N

TOM5   GAG TTA GAA GTG AAG CCG GAT ATA CCT ATT   330
MTOM5  GAA CTT GAG GTT AAA CCT GAC ATT CCA ATA   330
        E   L   E   V   K   P   D   I   P   I

TOM5   CCG GGG AAT TTG GGC TTG TTG AGT GAA GCA   360
MTOM5  CCT GGA AAC CTT GGA CTT CTT TCT GAG GCT   360
        P   G   N   L   G   L   L   S   E   A

TOM5   TAT GAT AGG TGT GGT GAA GTA TGT GCA GAG   390
MTOM5  TAC GAC AGA TGC GGA GAG GTT TGC GCA GAA   390
        Y   D   R   C   G   E   V   C   A   E

TOM5   TAT GCA AAG ACG TTT AAC TTA GGA ACT ATG   420
MTOM5  TAC GCT AAA ACC TTC AAT TTG GGT ACC ATG   420
        Y   A   K   T   F   N   L   G   T   M

TOM5   CTA ATG ACT CCC GAG AGA AGA AGG GCT ATC   450
MTOM5  TTG ATG ACA CCA GAA AGG CGT CGT GCA ATA   450
        L   M   T   P   E   R   R   R   A   I

TOM5   TGG GCA ATA TAT GTA TGG TGC AGA AGA ACA   480
MTOM5  TGG GCT ATT TAC GTT TGG TGT AGG CGT ACT   480
        W   A   I   Y   V   W   C   R   R   T

TOM5   GAT GAA CTT GTT GAT GGC CCA AAC GCA TCA   510
MTOM5  GAC GAG TTA GTG GAC GGA CCT AAT GCT AGT   510
        D   E   L   V   D   G   P   N   A   S

TOM5   TAT ATT ACC CCG GCA GCC TTA GAT AGG TGG   540
MTOM5  TAC ATA ACA CCC GCT GCT CTT GAC AGA TGG   540
        Y   I   T   P   A   A   L   D   R   W


TOM5   GAA AAT AGG CTA GAA GAT GTT TTC AAT GGG   570
MTOM5  GAG AAC CGT TTG GAG GAC GTG TTT AAC GGC   570
        E   N   R   L   E   D   V   F   N   G

TOM5   CGG CCA TTT GAC ATG CTC GAT GGT GCT TTG   600
MTOM5  AGA CCT TTC GAT ATG TTG GAC GGA GCA CTT   600
        R   P   F   D   M   L   D   G   A   L

TOM5   TCC GAT ACA GTT TCT AAC TTT CCA GTT GAT   630
MTOM5  AGT GAC ACT GTG AGC AAT TTC CCT GTG GAC   630
        S   D   T   V   S   N   F   P   V   D

TOM5   ATT CAG CCA TTC AGA GAT ATG ATT GAA GGA   660
MTOM5  ATC CAA CCT TTT CGG GAC ATG ATC GAG GGC   660
        I   Q   P   F   R   D   M   I   E   G

TOM5   ATG CGT ATG GAC TTG AGA AAA TCG AGA TAC   690
MTOM5  ATG AGA ATG GAT CTT CGT AAG TCT CGT TAT   690
        M   R   M   D   L   R   K   S   R   Y

TOM5   AAA AAC TTC GAC GAA CTA TAC CTT TAT TGT   720
MTOM5  AAG AAT TTT GAT GAG TTG TAT TTG TAC TGC   720
        K   N   F   D   E   L   Y   L   Y   C

TOM5   TAT TAT GTT GCT GGT ACG GTT GGG TTG ATG   750
MTOM5  TAC TAC GTG GCA GGA ACC GTG GGC CTT ATG   750
        Y   Y   V   A   G   T   V   G   L   M

TOM5   AGT GTT CCA ATT ATG GGT ATC GCC CCT GAA   780
MTOM5  TCA GTG CCT ATC ATG GGA ATT GCA CCA GAG   780
        S   V   P   I   M   G   I   A   P   E

TOM5   TCA AAG GCA ACA ACA GAG AGC GTA TAT AAT   810
MTOM5  AGT AAA GCT ACT ACT GAA TCT GTT TAC ACC   810
        S   K   A   T   T   E   S   V   Y   N

TOM5   GCT GCT TTG GCT CTG GGG ATC GCA AAT CAA   840
MTOM5  GCA GCA CTA GCA TTA GGT ATA GCT AAC CAG   840
        A   A   L   A   L   G   I   A   N   Q

TOM5   TTA ACT AAC ATA CTC AGA GAT GTT GGA GAA   870
MTOM5  CTT ACA AAT ATC TTG AGG GAC GTG GGT GAG   870
        L   T   N   I   L   R   D   V   G   E






TOM5   GAT GCC AGA AGA GGA AGA GTC TAC TTG CCT   900
MTOM5  GAC GCA CGT AGG GGT CGT GTG TAT CTC CCA   900
        D   A   R   R   G   R   V   Y   L   P

TOM5   CAA GAT GAA TTA GCA CAG GCA GGT CTA TCC   930
MTOM5  CAG GAC GAG CTC GCT CAA GCT GGA TTG AGT   930
        Q   D   E   L   A   Q   A   G   L   S

TOM5   GAT GAA GAT ATA TTT GCT GGA AGG GTG ACC   960
MTOM5  GAC GAG GAC ATT TTC GCA GGT CGT GTT ACA   960
        D   E   D   I   F   A   G   R   V   T

TOM5   GAT AAA TGG AGA ATC TTT ATG AAG AAA CAA   990
MTOM5  GAC AAG TGG AGG ATT TTC ATG AAA AAG CAG   990
        D   K   W   R   I   F   M   K   K   Q

TOM5   ATA CAT AGG GCA AGA AAG TTC TTT GAT GAG  1020
MTOM5  ATT CAC CGT GCT CGT AAA TTT TTC GAC GAA  1020
        I   H   R   A   R   K   F   F   D   E

TOM5   GCA GAG AAA GGC GTG ACA GAA TTG AGC TCA  1050
MTOM5  GCT GAA AAG GGA GTT ACT GAG CTT TCT AGT  1050
        A   E   K   G   V   T   E   L   S   S

TOM5   GCT AGT AGA TTC CCT GTA TGG GCA TCT TTG  1080
MTOM5  GCA TCA AGG TTT CCA GTT TGG GCC AGC CTT  1080
        A   S   R   F   P   V   W   A   S   L

TOM5   GTC TTG TAC CGC AAA ATA CTA GAT GAG ATT  1110
MTOM5  GTG CTC TAT AGA AAG ATT TTG GAC GAA ATC  1110
        V   L   Y   R   K   I   L   D   E   I

TOM5   GAA GCC AAT GAC TAC AAC AAC TTC ACA AAG  1140
MTOM5  GAG GCT AAC GAT TAT AAT AAT TTT ACT AAA  1140
        E   A   N   D   Y   N   N   F   T   K

TOM5   AGA GCA TAT GTG AGC AAA TCA AAG AAG TTG  1170
MTOM5  CGT GCT TAC GTT TCT AAG AGC AAA AAA CTT  1170
        R   A   Y   V   S   K   S   K   K   L

TOM5   ATT GCA TTA CCT ATT GCA TAT GCA AAA TCT  1200
MTOM5  ATC GCT CTT CCA ATC GCT TAC GCT AAG AGC  1200
        I   A   L   P   I   A   Y   A   K   S

TOM5   CTT GTG CCT CCT ACA AAA ACT GCC TCT CTT  1230
MTOM5  TTG GTT CCA CCA ACT AAG ACA GCT AGC TTG  1230
        L   V   P   P   T   K   T   A   S   L

TOM5   CAA AGA TAA  1239
MTOM5  CAG AGG TGA  1239
        Q   R   *

. = Same Base
- = Different Base

DNA SEQUENCE: 63% HOMOLOGY

PROTEIN SEQUENCE: 100% HOMOLOGY
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: ZENECA LIMITED
(ii) TITLE OF INVENTION: ENHANCEMENT OF GENE EXPRESSION
(iii) NUMBER OF SEQUENCES: 3

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: IP DEPT., ZENECA AGROCHEMICALS,

(B) STREET: JEALOTTS HILL RESEARCH STATION,

(C) CITY: BRACKNELL,

(D) STATE: BERKSHIRE

(E) COUNTRY: GB

(F) ZIP: RG42 6ET

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: WO NOT KNOWN

(B) FILING DATE:

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: HUSKISSON, FRANK M

(C) REFERENCE/DOCKET NUMBER: PPD 50156/WO

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 01344 414822

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1239 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: SYNTHETIC DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATGAGCGTGG CACTTCTTTG GGTGGTGAGC CCATGCGATG TGAGTAACGG CACTTCATTT    60

ATGGAGAGTG TGAGAGAAGG TAATAGATTC TTCGACAGTT CTCGTCACCG TAACCTTGTT   120

AGTAACGAAC GTATAAACAG GGGAGGAGGT AAACAGACAA ACAACGGTAG AAAGTTCTCA   180

GTTAGATCAG CAATCCTTGC AACACCTAGC GGTGAGAGAA CTATGACTAG CGAGCAAATG   240

GTGTACGACG TCGTACTTCG TCAAGCTGCA CTAGTTAAAC GTCAGTTACG TAGTACTAAC   300

GAACTTGAGG TTAAACCTGA CATTCCAATA CCTGGAAACC TTGGACTTCT TTCTGAGGCT   360

TACGACAGAT GCGGAGAGGT TTGCGCAGAA TACGCTAAAA CCTTCAATTT GGGTACCATG   420

TTGATGACAC CAGAAAGGCG TCGTGCAATA TGGGCTATTT ACGTTTGGTG TAGGCGTACT   480

GACGAGTTAG TGGACGGACC TAATGCTAGT TACATAACAC CCGCTGCTCT TGACAGATGG   540

GAGAACCGTT TGGAGGACGT GTTTAACGGC AGACCTTTCG ATATGTTGGA CGGAGCACTT   600

AGTGACACTG TGAGCAATTT CCCTGTGGAC ATCCAACCTT TTCGGGACAT GATCGAGGGC   660

ATGAGAATGG ATCTTCGTAA GTCTCGTTAT AAGAATTTTG ATGAGTTGTA TTTGTACTGC   720

TACTACGTGG CAGGAACCGT GGGCCTTATG TCAGTGCCTA TCATGGGAAT TGCACCAGAG   780

AGTAAAGCTA CTACTGAATC TGTTTACACC GCAGCACTAG CATTAGGTAT AGCTAACCAG   840

CTTACAAATA TCTTGAGGGA CGTGGGTGAG GACGCACGTA GGGGTCGTGT GTATCTCCCA   900

CAGGACGAGC TCGCTCAAGC TGGATTGAGT GACGAGGACA TTTTCGCAGG TCGTGTTACA   960

GACAAGTGGA GGATTTTCAT GAAAAAGCAG ATTCACCGTG CTCGTAAATT TTTCGACGAA  1020

GCTGAAAAGG GAGTTACTGA GCTTTCTAGT GCATCAAGGT TTCCAGTTTG GGCCAGCCTT  1080

GTGCTCTATA GAAAGATTTT GGACGAAATC GAGGCTAACG ATTATAATAA TTTTACTAAA  1140

CGTGCTTACG TTTCTAAGAG CAAAAAACTT ATCGCTCTTC CAATCGCTTA CGCTAAGAGC  1200

TTGGTTCCAC CAACTAAGAC AGCTAGCTTG CAGAGGTGA  1239

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1239 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: LYCOPERSICON ESCULENTUM (TOMATO)

(vii) IMMEDIATE SOURCE:

(B) CLONE: GTOM5 - PHYTOENE SYNTHASE GENE

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

ATGTCTGTTG CCTTGTTATG GGTTGTTTCT CCTTGTGACG TCTCAAATGG GACAAGTTTC    60

ATGGAATCAG TCCGGGAGGG AAACCGTTTT TTTGATTCAT CGAGGCATAG GAATTTGGTG   120

TCCAATGAGA GAATCAATAG AGGTGGTGGA AAGCAAACTA ATAATGGACG GAAATTTTCT   180

GTACGGTCTG CTATTTTGGC TACTCCATCT GGAGAACGGA CGATGACATC GGAACAGATG   240

GTCTATGATG TGGTTTTGAG GCAGGCAGCC TTGGTGAAGA GGCAACTGAG ATCTACCAAT   300

GAGTTAGAAG TGAAGCCGGA TATACCTATT CCGGGGAATT TGGGCTTGTT GAGTGAAGCA   360

TATGATAGGT GTGGTGAAGT ATGTGCAGAG TATGCAAAGA CGTTTAACTT AGGAACTATG   420

CTAATGACTC CCGAGAGAAG AAGGGCTATC TGGGCAATAT ATGTATGGTG CAGAAGAACA   480

GATGAACTTG TTGATGGCCC AAACGCATCA TATATTACCC CGGCAGCCTT AGATAGGTGG   540

GAAAATAGGC TAGAAGATGT TTTCAATGGG CGGCCATTTG ACATGCTCGA TGGTGCTTTG   600

TCCGATACAG TTTCTAACTT TCCAGTTGAT ATTCAGCCAT TCAGAGATAT GATTGAAGGA   660

ATGCGTATGG ACTTGAGAAA ATCGAGATAC AAAAACTTCG ACGAACTATA CCTTTATTGT   720

TATTATGTTG CTGGTACGGT TGGGTTGATG AGTGTTCCAA TTATGGGTAT CGCCCCTGAA   780

TCAAAGGCAA CAACAGAGAG CGTATATAAT GCTGCTTTGG CTCTGGGGAT CGCAAATCAA   840

TTAACTAACA TACTCAGAGA TGTTGGAGAA GATGCCAGAA GAGGAAGAGT CTACTTGCCT   900

CAAGATGAAT TAGCACAGGC AGGTCTATCC GATGAAGATA TATTTGCTGG AAGGGTGACC   960

GATAAATGGA GAATCTTTAT GAAGAAACAA ATACATAGGG CAAGAAAGTT CTTTGATGAG  1020

GCAGAGAAAG GCGTGACAGA ATTGAGCTCA GCTAGTAGAT TCCCTGTATG GGCATCTTTG  1080

GTCTTGTACC GCAAAATACT AGATGAGATT GAAGCCAATG ACTACAACAA CTTCACAAAG  1140

AGAGCATATG TGAGCAAATC AAAGAAGTTG ATTGCATTAC CTATTGCATA TGCAAAATCT  1200

CTTGTGCCTC CTACAAAAAC TGCCTCTCTT CAAAGATAA  1239

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 
40 (A) LENGTH: 402 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: LYCOPERSICON ESCULENTUM (TOMATO)

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: TRANSLATION PRODUCT OF GTOM5 AND MTOM5 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Ser Val Ala Leu Leu Trp Val Val Ser Pro Cys Asp Val Ser Asn
1 5 10 15

Gly Thr Ser Phe Met Glu Ser Val Arg Glu Gly Asn Arg Phe Phe Asp
20 25 30

Ser Ser Arg His Arg Asn Leu Val Ser Asn Glu Arg Ile Asn Arg Gly
35 40 45

Gly Gly Lys Gln Thr Asn Asn Gly Arg Lys Phe Ser Val Arg Ser Ala
50 55 60

Ile Leu Ala Thr Pro Ser Gly Glu Arg Thr Met Thr Ser Glu Gln Met
65 70 75 80

Val Tyr Asp Val Val Leu Arg Gln Ala Ala Leu Val Lys Arg Gln Leu
85 90 95

Arg Ser Thr Asn Glu Leu Glu Val Lys Pro Asp Ile Pro Ile Pro Gly
100 105 110

Asn Leu Gly Leu Leu Ser Glu Ala Tyr Asp Arg Cys Gly Glu Val Cys
115 120 125

Ala Glu Tyr Ala Lys Thr Phe Asn Leu Gly Thr Met Leu Met Thr Pro
130 135 140

Glu Arg Arg Arg Ala Ile Trp Ala Ile Tyr Val Trp Cys Arg Arg Thr
145 150 155 160

Asp Glu Leu Val Asp Gly Pro Asn Ala Ser Tyr Ile Thr Pro Ala Ala
165 170 175

Leu Asp Arg Trp Glu Asn Arg Leu Glu Asp Val Phe Asn Gly Arg Pro
180 185 190

Phe Asp Met Leu Asp Gly Ala Leu Ser Asp Thr Val Ser Asn Phe Pro
195 200 205

Val Asp Ile Gln Pro Phe Arg Asp Met Ile Glu Gly Met Arg Met Asp
210 215 220

Leu Arg Lys Ser Arg Tyr Lys Asn Phe Asp Glu Leu Tyr Leu Tyr Cys
225 230 235 240

Tyr Tyr Val Ala Gly Thr Val Gly Leu Met Ser Val Pro Ile Met Gly
245 250 255

Ile Ala Pro Glu Ser Lys Ala Thr Thr Glu Ser Val Tyr Asn Ala Ala
260 265 270

Leu Ala Leu Gly Ile Ala Asn Gln Leu Thr Asn Ile Leu Arg Asp Val
275 280 285

Gly Glu Asp Ala Arg Arg Gly Arg Val Tyr Leu Pro Gln Asp Glu Leu
290 295 300

Ala Gln Ala Gly Leu Ser Asp Glu Asp Ile Phe Ala Gly Arg Val Thr
305 310 315 320

Ile His Arg Ala Arg Lys Phe Phe Asp Glu Ala Glu Lys Gly Val Thr
325 330 335

Glu Leu Ser Ser Ala Ser Arg Phe Pro Val Trp Ala Ser Leu Val Leu
340 345 350

Tyr Arg Lys Ile Leu Asp Glu Ile Glu Ala Asn Asp Tyr Asn Asn Phe
355 360 365

Thr Lys Arg Ala Tyr Val Ser Lys Ser Lys Lys Leu Ile Ala Leu Pro
370 375 380

Ile Ala Tyr Ala Lys Ser Leu Val Pro Pro Thr Lys Thr Ala Ser Leu
385 390 395 400

Gln Arg
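
The relationship between a nucleotide listing and its translation product can be checked mechanically. The following is a minimal sketch, not part of the original specification, that translates a coding sequence using the standard genetic code. The names `translate` and `three_letter` and the constants are illustrative assumptions; the two fragments are copied from SEQ ID NO: 2 (nucleotides 1 to 60 and 1201 to 1239).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names, as used in the listing.
THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna):
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


def three_letter(protein):
    """Render a one-letter protein string in three-letter notation."""
    return " ".join(THREE[r] for r in protein)


# Fragments copied from SEQ ID NO: 2 (spaces are ignored by translate).
SEQ2_START = "ATGTCTGTTG CCTTGTTATG GGTTGTTTCT CCTTGTGACG TCTCAAATGG GACAAGTTTC"
SEQ2_END = "CTTGTGCCTC CTACAAAAAC TGCCTCTCTT CAAAGATAA"

if __name__ == "__main__":
    print(three_letter(translate(SEQ2_START)))
    print(three_letter(translate(SEQ2_END)))
```

Run as a script, this prints the residues encoded by each fragment, which can be compared by eye with the first and last rows of SEQ ID NO: 3.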
CLAIMS

1. A method of enhancing expression of a selected protein by an organism having
a gene which produces said protein, comprising inserting into the genome of
the said organism a DNA the nucleotide sequence of which is such that the
RNA produced on transcription is different from but the protein produced on
translation is the same as that expressed by the gene already present in the
genome.

2. A method as claimed in claim 1, in which the organism is a plant.

3. A method as claimed in claim 2, in which the plant is a tomato plant. 

4. A method as claimed in any preceding claim, in which the selected gene is the 
gene encoding phytoene synthase.

5. A method as claimed in claim 4, in which the coding region of the said inserted 
gene has the sequence SEQ-ID-NO-1. 

6. A gene construct comprising in sequence a promoter which is operable in a
target organism, a coding region encoding a protein and a termination signal
characterised in that the nucleotide sequence of the said construct is such that
the RNA produced on transcription is different from but the protein produced
on translation is the same as that expressed by the gene already present in the
genome.

7. A method of enhancing expression of carotenoids in a plant comprising
overexpressing in the plant a gene specifying an enzyme necessary to the
biosynthesis of carotenoids, the said overexpression being achieved by the use
of a modified transgene having a different nucleotide sequence from the
endogenous sequence.

8. A method as claimed in claim 7, in which the modified gene specifies phytoene
synthase.

9. A modified chloroplast targeting sequence comprising nucleotides 1 to 417 of
SEQ-ID-NO-1.
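
The principle recited in claims 1 and 6, a coding sequence whose transcript differs from that of the endogenous gene while the translated protein is unchanged, can be illustrated with synonymous codon substitution. The following is a minimal sketch only, under stated assumptions: the function names `recode`, `translate` and `identity` are hypothetical, and the substitution rule (take the first alternative codon in table order) is arbitrary. It is not the procedure by which SEQ-ID-NO-1 was designed; a practical design would also weigh the codon usage of the host plant.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Group codons by the residue (or stop signal) they encode.
SYNONYMS = {}
for codon, residue in CODON_TABLE.items():
    SYNONYMS.setdefault(residue, []).append(codon)


def translate(dna):
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def recode(dna):
    """Replace each codon by a different synonymous codon where one exists.

    Illustrative rule only: the first alternative in table order is chosen.
    Codons with no synonym (ATG for Met, TGG for Trp) are left unchanged.
    """
    dna = dna.replace(" ", "").upper()
    out = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        alternatives = [c for c in SYNONYMS[CODON_TABLE[codon]] if c != codon]
        out.append(alternatives[0] if alternatives else codon)
    return "".join(out)


def identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Nucleotides 1 to 60 of SEQ ID NO: 2, used here purely as example input.
ENDOGENOUS = "ATGTCTGTTGCCTTGTTATGGGTTGTTTCTCCTTGTGACGTCTCAAATGGGACAAGTTTC"

if __name__ == "__main__":
    modified = recode(ENDOGENOUS)
    print("same protein:", translate(modified) == translate(ENDOGENOUS))
    print("nucleotide identity: %.2f" % identity(ENDOGENOUS, modified))
```

The check that matters is the first one: the recoded sequence must translate to the same residues as the input, while the nucleotide identity falls below 1.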
FIGURE 3

MTOM5 encodes phytoene synthase